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Abstract Loops are essential secondary structure elements in folded DNA and RNA molecules and prolif- 
erate close to the melting transition. Using a theory for nucleic acid secondary structures that accounts for 
the logarithmic entropy — clnm for a loop of length m, we study homopolymeric single-stranded nucleic 

S acid chains under external force and varying temperature. In the thermodynamic limit of a long strand, 
the chain displays a phase transition between a low temperature / low force compact (folded) structure 
and a high temperature / high force molten (unfolded) structure. The influence of c on phase diagrams, 
critical exponents, melting, and force extension curves is derived analytically. For vanishing pulling force, 
only for the limited range of loop exponents 2 < c < 2.479 a melting transition is possible; for c < 2 the 
chain is always in the folded phase and for 2.479 < c always in the unfolded phase. A force induced melting 
transition with singular behavior is possible for all loop exponents c < 2.479 and can be observed exper- 
imentally by single molecule force spectroscopy. These findings have implications for the hybridization or 
denaturation of double stranded nucleic acids. The Poland-Scheraga model for nucleic acid duplex melting 
does not allow base pairing between nucleotides on the same strand in denatured regions of the double 
strand. If the sequence allows these intra-strand base pairs, we show that for a realistic loop exponent 
cRi 2.1 pronounced secondary structures appear inside the single strands. This leads to a lower melting 
O temperature of the duplex than predicted by the Poland-Scheraga model. Further, these secondary struc- 

tures renormalize the effective loop exponent c, which characterizes the weight of a denatured region of 
I the double strand, and thus affect universal aspects of the duplex melting transition. 
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1 Introduction 



Ribonucleic acids continue to stay in the focus of experimentalists and theorists pQ. The advance of single molecule 
techniques [2]-[7] nowadays allows to study single chains of nucleic acids under tension and varying solution conditions 
and thereby yields unprecedented insights into the behavior and folding properties of these essential molecules. Theory 
on RNA folding vastly relies on the idea of hierarchical folding proposed by Tinoco et al. 19] , stating that given a 
sequence (the primary structure), the secondary structure (i.e. the list of all base pairs) forms independently of the 
5— i tertiary structure (the overall three-dimensional arrangement of all atoms). This is in contrast to the protein folding 
problem, which does not feature these well separated energy scales between the different structural levels and hence is 
more involved [TU]. The idea of hierarchical folding therefore suggests to solely focus on the secondary structure, i. e. 
the base pairs, and to neglect the tertiary structure. This constitutes a major simplification and enables to calculate 
partition functions exactly and to predict the secondary structure formed by a given RNA sequence. De Gennes pj] was 
the first to calculate the partition function of an ideal homopolymeric RNA chain by using a propagator formalism 
and solving the partition function by means of a singularity analysis of the generating functions. Due to his real 
space approach for an ideal polymer, the loop exponent was fixed at c = 3/2. The loop exponent c characterizes the 
logarithmic entropy contribution cx lnm~ c of a loop of length m. Ten years later, Waterman and Smith [T^] devised 
a recursion relation appropriate for the partition function of folded RNA, which now lies at the heart of most RNA 
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secondary structure and free energy prediction algorithms currently used. Since all results obtained for RNA are also 
valid for DNA on our relative primitive level of modeling, we will mostly explicitly refer to RNA in our paper but 
note that in principle all our results carry over to single stranded DNA molecules, as well. Subsequently, several 
theoretical models were developed to study RNA and DNA: those were focused on melting [T3HT5] . stretching [l6l - 
|2"T] . unzipping [55], translocation [53], salt influence [2~4"H2"o] . pseudoknots [57[ [5B], and the influence of the loop 
exponent [T3J [TBI GIB GH] ■ In this context, an interesting question in connection with the melting of double stranded 
DNA (dsDNA) arises: Do secondary structure elements form in the single strands inside denatured dsDNA loops or 
not? Formation of such secondary structures in dsDNA loops would mean that inter-strand base pairing between the 
two strands - being responsible for the assembly of the double helix - is in competition with intra-strand pairing, 
where bases of the same strand interact. This question is not only important for the thermal melting of dsDNA but 
also for DNA transcription, DNA replication and the force- induced overstretching transition of DNA [3J [3TJ . 

In this paper, the influence of the loop exponent c on the behavior of RNA subject to varying temperature and 
external force is studied, which goes beyond our previous work where only the temperature influence was consid- 
ered [13]. We neglect sequence effects and consider a long homopolymeric, single stranded RNA molecule. A closed 
form expression for the partition function is derived, which allows to study the thermodynamic behavior in detail. The 
phase diagram in the forcc-tcmperaturc plane is obtained. Wc find that the existence of a temperature induced phase 
transition depends crucially on the value of the loop exponent c: at vanishing force a melting transition is possible only 
for the limited range of loop exponents 2 < c < 2.479. c 2.1 is a typical exponent that characterizes the entropy of 
loops usually encountered in RNA structures - hairpin loops, internal loops, multi-loops with three or more emerging 
helices. That means that RNA molecules are expected to experience a transition between a folded and an unfolded 
state. This is relevant for structure formation in DNA or RNA single strands and can in principle be tested in double 
laser trap force clamp experiments [32]. Our findings also have implications for the denaturation of double helical 
nucleotides (e. g. dsDNA). Since intra-strand and inter-strand base pairing compete, secondary structure formation of 
the single strands inside denatured regions of the duplex has to be taken into account in a complete theory of dsDNA 
melting. In the case where intra- and inter-strand base pairing occurs, the classical Poland-Schcraga mechanism for 
the melting of a DNA duplex has to be augmented by the single-strand folding scenario considered by us, as the 
Poland-Scheraga theory is only valid in the case where no intra-strand base pairing is possible. If the intra-strand 
interactions are strong enough to induce folded secondary structures, we show that the loop exponent governing the 
entropy of inter-strand loops is renormalized and takes on an effective universal value that only depends on whether 
the inter-strand loops are symmetric (consisting of two strands of the same length) or asymmetric. The resulting 
duplex melting transition is universal and turns out to be strongly discontinuous in the first, symmetric case, and on 
the border between continuous and discontinuous in the second, asymmetric case. In the case when intra-strand base 
pairing is weak and a single strand is in the unfolded phase, the situation is different and qualitatively similar to the 
original Poland-Scheraga results, yet with a lower melting temperature. All these effects can be studied experimentally. 
We make explicit suggestions for dsDNA sequences, with which the formation of intra-strand secondary structures 
inside inter-strand loops can be selectively inhibited or favored. 



2 Derivation of partition function 
2.1 Model 

Single stranded RNA is modeled as a one-dimensional chain. The basic units are nucleotides with the four bases 
(cytosine (C), adenine (A), guanine (G), uracil (U)), which are enumerated by an index i = 1, . . . ,N. A nucleotide can 
establish a base pair (bp) with another nucleotide via hydrogen bonding, leading to helices and loops as the structural 
building blocks of an RNA secondary structure, see fig. [Jj In agreement with previous treatments, a valid secondary 
structure is a list of all base pairs with the constraint that a base can be part of at most one pair. In addition, 
pseudoknots are not allowed, that means that for any two base pairs and (fc, I) with i < j, k < I, and i < k we 
have either i < k < I < j or i < j < k < I |28j . This imposes a hierarchical order on the base pairs, meaning that two 
base pairs are either nested and part of the same substructure or are independent and part of different substructures, 
fig. [2] Helix stacking - the interaction of two helices emerging from the same loop - and base triples as well as the 
overall three-dimensional structure are not considered. Base pairs are stabilized by two different interactions. First, 
by hydrogen bonds between complementary bases and, second, by the stacking interaction between neighboring base 
pairs, which are accounted for by the sequence independent parameters g^ h and gjf ack , respectively. A helix with h 
base pairs consequently has the free energy /i(gjj b + g^ ack ) — g^ ack + <4> where 9h 1S a helix initiation free energy. 
Therefore, the hydrogen bonding and the stacking interaction can be combined to yield the binding energy per base 
pair e = — (g^ h + <7jf ack ), which we define to be positive, e can be measured experimentally by duplex hybridization |33j 
and contains the binding free energies as well as the extensive part of configurational polymer entropy. Further, the 
stacking interaction appears as an additional contribution to the helix initiation free energy. We define C/™' = fl^— flff 
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Figure 1: Schematic representation of the secondary structure of an RNA molecule. Dots represent one base, i.e. 
cytosine, guanine, adenine, or uracil. Solid lines denote the sugar-phosphate backbone bonds, broken lines base pairs, 
and thick gray lines the non-nested backbone bonds, which are counted by the variable M, here M = 11. The thick 
arrows to either side illustrate the force F applied to the 5'- and 3'-end. 
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Figure 2: The arc diagram is a representation of the secondary structure depicted in fig. [I] A dot represents one 
base. Solid lines denote the backbone bonds and thick gray lines the non-nested backbone bonds. Dashed arcs denote 
hydrogen bonds between two bases. A pseudoknot (dotted arc) is recognized here as crossing arcs. If no pseudoknots 
are present the structure is hierarchical, meaning that substructures are either nested or juxtaposed. 



and describe the binding free energy by a single, sequence independent parameter [T3] 

w = exp(e/(k B T)) . (1) 

w is the statistical weight of a bound base pair, T is the absolute temperature, and ke the Boltzmann constant. 
Therefore this is a model for homopolymeric RNA, which can be realized experimentally with synthetic alternating 
sequences [AU]jv/2 or [GC]jv/2- K has also been argued that this homopolymeric model describes random RNA above 
the glass transition [51] . 

The non-extensive contribution of the free energy of a loop is given by 

£f onf = -k B Tlnm- c (2) 

and describes the entropy difference between an unconstrained polymer and a looped polymer. The loop exponent c 
is Cidcai = 3/2 for an ideal polymer and csaw = dv — 1-76 for an isolated self avoiding loop with v ~ 0.588 in d = 3 
dimensions [35] , However, helices, which emerge from the loop, increase c even further. In the asymptotic limit of long 
helical sections, renormalization group predicts q = dv + oi — la^ for a loop with I emerging helices [30( 136] . where 
a i = el(2 - Z)/16 + e 2 l(l - 2) (81 - 21)/512 + 0(e 3 ) in an e = 4 - d expansion. One obtains a = 2.06 for terminal, 
C2 = 2.14 for internal loops and C4 = 2.16 for a loop with four emerging helices. For larger I the e expansion prediction 
for q becomes unreliable. One sees that the variation of c over different loop topologies that appear in the native 
structures of RNA is quite small, which justifies our usage of the same exponent c for loops of all topologies that occur 
in a given RNA secondary structure. 



2.2 Canonical partition function 

As we neglect pseudoknots, only hierarchical structures are present, which allows to write down a recursion relation 
for the partition function. Further, as we consider homopolymers and omit sequence effects by using a constant base 
pairing weight w, the system is translationally invariant. Hence, the canonical partition function Qf^ of a strand 
ranging from base i at the 5'-end through j at the 3'-end depends only on the total number of segments N — j — i 



4 



Thomas R. Einert et al.: Secondary structures of single-stranded nucleic acids 



M + l 




M 




1 


M 







i j + 1 




i 


J 


fc 


S 


! fc-1 




fc j + 1 







E * 


m 










fc j + 1 




m 


ki 


■1 J 











Figure 3: Illustration of the recursion scheme for the canonical partition function in eq. ([3]). Boxes denote partition 
functions of substrands (the range is given by the subindices) . The numbers inside a box give the number of non- nested 
backbone bonds. To calculate the partition function of a strand ranging from i through j + 1, consider the partition 
function of a strand ranging from i through j and add base number j + which may (right term in first row) or 
may not (left term in first row) establish a base pair with base number k. In the second row Q® J+1 is calculated 
by closing structures with m non- nested bonds with a hydrogen bond (dashed line). For homopolymcric RNA, the 
sequence dependence drops out and only the lengths of the substrands, N = j — i, n = k — 1— i, N — n = j + 1 — k, 
need to be considered, see eq. 



and on the number of non-nested backbone bonds Al. A non-nested bond is defined as a backbone bond, which is 
neither part of a helix nor part of a loop. It is outside all secondary structure elements and therefore contributes to the 
end-to-end extension, which couples to an external stretching force and which can be observed for example in force 



spectroscopy experiments [T31 [2«2 EZj > see figs. [I] and [5j The recursion relations for Q 



jf can be written as 



n M+i _ *t(M + 1) 
Wn+i 



and 



Vf(M) 



In 



N-l 

n=M 



N-n 



(3a) 



N-n-2 

E 

rn = — 1 



Vf(m) 



-G^ejm - 2)/(k B T)) 
(m + 2) c 



(3b) 



and is illustrated in fig. |3| The Heaviside step function is 0{m) = if m < and 0(m) = 1 if m > 0. Eq. (3a) 
describes the elongation of an RNA structure by either adding an unpaired base (first term) or by adding an arbitrary 
substrand Q%^ n that is terminated by a helix. Eq. (3b) constructs Q° N _ n by closing structures with m non-nested 
bonds, summed up in <5^_ rl _ 2 j by a base pair. Vf(M) denotes the number of configurations of a free chain with M 
links and can be completely eliminated from the recursion relation by introducing the rescaled partition function 
Qn = Qn / v f(M). We set ^J" 1 ' = for computational simplicity and combine eqs. (3a) 
final recursion relation 



and ( 3b ) , which leads to the 



Q 



M+l 

N+l 



Q 



N-l N-n-2 fiMnm 

E 

n=M m=— 1 



(m + 2) c 



(4) 



with the boundary conditions Qz\ = 1, Qm = for M > N, N < 0, or M < 0. The thermodynamic limit of an 
infinitely long RNA chain is described by the canonical Gibbs ensemble, which is characterized by a fixed number 
of segments N, but a fluctuating number of non-nested backbone bonds M. Therefore we introduce the unrestricted 
partition function 



Z N (s) 



E sM Qn ' 

M=0 



(5) 



which contains the influence of an external force F via the statistical weight s of a non-nested backbone bond. For 
RNA with no force applied to the ends, one has s = 1. We model the RNA backbone elasticity by the freely jointed 
chain (FJC) model, where the weight of a non-nested backbone bond subject to an external force is given by 



4tt J J /3Fb ss 
here, we introduced the inverse thermal energy /? = (keT) -1 and the Kuhn length b ss 



(0) 
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Figure 4: Structure of the grand canonical partition function Z(z, s) according to eqs. ^ and ( 11 ). The grand canonical 
partition function of the Gibbs ensemble is a sum over all numbers of non-nested backbone bonds (thick gray lines) 
with statistical weight sz. Between two adjacent backbone bonds can be either a single nucleotide (dot), with statistical 
weight 1, or a structure with weight Z^, whose first and last base are paired. The white squares serve as wildcards 
for either possibility and have the statistical weight k(w,z). Thin black lines depict backbone bonds that are part of 
a helix or loop and have statistical weight z. 



2.3 Grand canonical partition function 

For studying the phase transition and the critical behavior, it is useful to introduce the generating function or grand 
canonical partition function 



N ) 



N=0 



N=0 M=0 



where z — exp(/i/ (k^T)) is the fugacity. Performing the weighted double sum X)iv=-i S 
eq. Q yields 

(sz^Z = (sz)- 1 +Z+ ({sz)- 1 + Z) (k - 1) , 

which can be solved and one obtains the generating function 

k(w, z) 



M = 



(7) 



on both sides of 



Z(z,s) = 



1 — szk(w, z) 



(8) 



(9) 



Here we have defined k(w, z) = 1 + Z\ ) (w, z) as the grand canonical partition function of RNA structures with zero 
non-nested backbone bonds, i. e. structures which consist of just one nucleotide or structures where the terminal bases 
are paired, 



k(w, z) = 1 + Z] i (w, z) = 1 + wz 2 



' N 



N=-l M=-l 



(M + 2) 



(10) 



Eq. ([9]) has an instructive interpretation, which becomes clear by expanding the fraction in a geometric series 

00 00 

Z(z,s)= ^s m z m k m+1 = ^(l + Z b ).( S z(l + Z b )) M , (11) 



M=0 



M=0 



where sz is the statistical weight for a backbone segment which is not nested. Between two adjacent segments we have 
the possibility to put either a single nucleotide (with statistical weight 1) or a structure whose first and last bases are 
paired (with statistical weight Z^). See fig. H for an illustration. 

In order to determine the function k(w, z), we compare the coefficients of the power series in s in eqs. (7) and (p) 
and obtain z M n M+1 — X)tv=m zN Qn ■ ^ ne l° wer summation index is due to exchanging the summations in eq. IW, 



bearing in mind that Q^J = for M > N. This identity can be inserted into eq. ( 10 ) and yields the equation 



k(w, z) — 1 



k(w, z) 



1A c (zk(w 1 z)) 



(12) 
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Figure 5: Graphical solu tion of the equations, which determine k(w, z), z^, z p . The functions n — 1 (dashed line) and 
Ii(k,z) (solid lines), eq. (13 1, are plotted for w = 10 and different values of the fugacity z and the loop exponent c, 
(a) c = 0.9 (b) c = 3/2 (c) c = 2.1. Points at which both curves intersect are solutions of eq. ( 12 ) and determine k(w, z). 
In case of two positive solutions the smaller yiel ds t he correct solution as k(w,z) is to be a continuous, monotonically 
increasing function of z with k(w,0) = 1, eq. (10). Points at which both curves are adjacent to each other (open 
circles) determine the branch point z\,. Points at which zk = 1 (filled circles) determine the position of the pole z p in 
the absence of force, s = 1. 



which determines k(w,z). Li c (z«) = 



1 /m c , for zk < 1, is the poly logarithm [35]. We introduce 



h(n, z) — — L1 c (zk) 



(13) 



and rewrite eq. ( 12 1 as k — 1 = h(t y, z) . Eq. ( 12 1 has at most two positive and real solutions as can be seen from fig. [5j 
where we plot the two sides of eq. ( 12 ). n(w, z) is a continuous, monotonically increasing function of z with n(w, 0) = 1 
as follows from eq. (10). Therefore, only the smallest positive root of eq. (12) yields the correct k(w,z). For z — > 
there is always a positive and real solution for k(w,z). Increasing z increases k(w,z) until eventually, at z = z\>, the 
real solution for k(w,z) vanishes. Thus, n(w, z) has a branch point at z = z\>. Depending on the value of the loop 
exponent c, the polylogarithm Li c (z«;) and /i(k, z) (a) are divergent for c < 1, (b) arc finite, but feature a diverging 
slope for 1 < c < 2, or (c) have a finite value and derivative for 2 < c at zk = 1, see fig. [5j This will become important 
later, when the existence of phase transitions is studied. 



2.4 Back-transform to canonical ensemble 

Since the thermodynamic limit N — > oo is defined in the canonical Gibbs ensemble, we now demonstrate how to 
obtain Z^{s), eq. ([5]), from Z(z, s), eq. (J9j) . For large systems, N ^ 1, the canonical partition function is given by the 
dominant singularity z^(s) of Z(z, s), which is defined as the singularity which is nearest to the origin in the complex 
plane |16l I39j . In particular if Z(z, s) <~ K (s) (zd( s ) — z ) ° with K(s) independent of z, we obtain 



z 



Z N (s) ~ z^^N"- 1 ■ K(s)za a (s)/r(a) , (14) 

where r(a) is the gamma function [40] . Therefore, the Gibbs free energy reads to leading orders in TV 

G/(k B T) =-]nZ N ~ N In z d (s) + (1 -a) In N . (15) 

In fact, Z{z, s) features two relevant singularities. First, the branch point z\,(w) of k(w,z), which is independent 
of s, and second a simple pole z p (w,s), where the denominator of Z(z,s) vanishes, see eq. (pM). Depending on which 
singularity has the smallest modulus, the molecule can be in different phases. In the following sections it will turn out 
that the low temperature, compact or folded phase is associated with z^, whereas the high temperature, extended or 
unfolded phase is characterized by z p . 

Let us consider the branch point first. It can be seen from fig. [5] that for z < z\, at least one real solution of 



eq. (12) exists, where the smaller solution determines k(w,z). Right at z — zy, the two solutions merge and the slope 
of h is h! '(k, Zb) — dh(n, z^)/dn = 1 at the tangent point. This yields the equation for the position of the branch point 
singularity Zb(u>), which is a function of w only, 



k(w, z h ) 2 = whi c ^i(z h n(w, z h )) - wLi c (z h K(w, z h )) 



(16) 
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The behavior of k(w, z) in the vicinity of the branch point can be obtained by expanding eq. (12) for z — > z b and 

z b K(w,z h ) < 1 



K,(w,z)~K h -( J K h (l - SZ h K h ) 2 , 

\ Z\y / 

where we used the short notation Kb = k(w, z b ) and defined 



K h = 



2wLi c _ 1 (z b K b ) 



■;Li c _ 2 (2bKb) - wLi c _i(z b K b ) - 2*^/ (1 - sz h n b ) 



1/2 



(17) 



(18) 



Due to the exponent 1/2 in the above equation, the function k(w, z) exhibits a first order branch point at z = z b and 
the grand canonical partition function, eq. Q, scales as 



Z{z,a) 



Kb 



1 - SZ h K h 



Z h -Z\V2 



Z], 



Ku 



Together with eq. (14), we obtain the following scaling for the canonical partition function 
Z N (s)~ z h N N- 3 / 2 K h /V^ , 



(19) 



(20) 



which leads to a logarithmic ./V-contribution with universal prefactor 3/2 to the free energy Q = — kBTlnZ/v, in accord 
with the findings of de Gennes [TT] . It will turn out that eq. ( 20 ) describes the low temperature or folded phase of the 
system. 

Now let us consider the pole singularity z p of the grand canonical partition function. z p (w,s) is a function of w 
and s and is given by the zero of the denominator of Z(z, s) in eq. ([9]), 



sz p k(w, z p ) = 1 



(21) 



The position of the pole can be evaluated in a closed form expression by plugging eq. (21) into eq. (12) and solving 
the resulting quadratic equation for z. One obtains 



z p (w, s) = - (l + Vl + ^Li^l/s)) 
k(w, z p ) = 5 + Vi + ^Li^l/s)) 



(22a) 
(22b) 



The behavior of k(w, z) in the vicinity of the pole can be obtained by expanding eq. (12) for z — > z p , z p k(w,z p ) 
l/s < 1, and c > 2 



K(W, z) <~ ftp — K, 



z p - z wLi c _i(l/s) 
z p K p (2k p - 1) 



where we used the short notation k p = k(w, z p ) and introduced 

2kI - - toLi c _i(l/s) 
p 2« p - 1 

Therefore, the grand canonical partition function scales as 

z( 2 , s) ~(^rVp 

V Z-r, ' 



(23) 



(24) 



(25) 



and together with eq. ( 14 ) we obtain the scaling of the canonical partition function 
Z N (s) ~ 2;^^ . 



(26) 



Later we will see, that eq. (26) describes the denatured high temperature phase of the system. In contrast to the 
branch point phase, eq. (20), no logarithmic contribution to the free energy is present. 



8 



Thomas R. Einert et al.: Secondary structures of single-stranded nucleic acids 




Figure 6: Illustration of the graphical solution of eqs. ( 12 ),( 16 ), and (21 1 at the phase transition. The solid line sketches 
h(n), eq. (13), the dashed line the function k—1. The open circles denote points where the conditions for the branch 
point are met: Curves have common points, eq. (12), and cur ves are tangent to each other in these points, eq. (16). 
The black filled circles denote points where szn — 1, eq. (21). If the black filled circle lies on the dashed curve the 
conditions for a pole are met, eqs. (12) and (21). For a given temperature T and force F the fugacity z is increased 
from z = to the value where either eqs. (12) and (16) hold (open circle on dashed line, folded phase) or eqs. (12) 
and (21 ) hold (black filled circle on dashed line, unfolded phase) . (a) Illustration of the thermal phase transition at zero 
force, s = 1. For low temperatures the branch point is dominant. Upon increasing the temperature, i. e. decreasing w, 
the branch point and the pole approach each other until they eventually coincide at the melting temperature T = T m 
and cause a phase transition. For T > T m there is no branch point anymore and the pole is dominant, (b) Illustration 
of the force induced phase transition at a temperature T < T m . For small forces the branch point is dominant. Upon 
increasing the force, i.e. increasing s, the point where zk = 1/s (black filled circle) moves towards the branch point. 
As the branch point is independent of the force, see eqs. ( 12 ) and ( |16[ ), no observable depends on F as long as F < F c . 
At F = F c the branch point and the pole coincide and a phase transition occurs. For F > F c the pole is dominant. 



3 Critical behavior 



The two singularities Zb(w) and z p (w, s) are smooth functions of external variables such as temperature T or force F, 
which enter via the weight of a base pair w, eq. ([I]), and the weight of a non- nested backbone bond s, eq. As 
the system is described by the singularity, which is closest to the origin, a phase transition associated with a true 
singularity in the free energy, eq. ( 15 1, is possible if these two singularities cross. For that purpose let us shortly review 
the three constitutive equations eqs. ( 12 ) ,( 16 1, and (21). As observed earlier, the smallest positive root of eq. (12) 



yields the function k(w, z). The simultaneous solution of eqs. (12) and (16) yields the branch point Z\ } (w), whereas 
the simultaneous solution of eqs. (12) and (21) yields the pole Zp[w,s), which can be expressed in a closed form, see 
eq. d22l. 



3.1 Critical point and existence of a phase transition 

The critical fugacity z c and thus the phase transition is defined as the point where the branch point and the pole 
coincide 



z c = z h (w c ) = z p (w c ,s c ) , 



(27) 



which means that all three eqs. ( 12 ),( 16 ), and (21 ) have to hold simultaneously, see fig.[6j Assuming that a pair (w c , s c ) 
exists so that eq. (27) is true, this can be evaluated further by plugging eqs. (22) into eq. (16) and we obtain 



Li c _i(l/s c ) - Li c (l/s c ) 
(Li c _i(l/s c )-2Li c (l/s c )) 2 



(28) 



This constitutes a closed form relation between w c and s c or, by employing eqs. ([I]) and ([6|, the critical temperature T c 
and force F c . The melting temperature T m is defined as the critical temperature at zero force. 
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The order of the branch point exactly at T = T m and zero force, s = 1, is calculated by expanding eq. (12 1 in 



powers of z/z c — 1 and k(w, z)/k c — 1 while keeping w = w c fixed. For vanishing force k c = l/z c and we obtain 

K c /Zc-znVO-i) 
k(w,z)~k c -— — I , 

A C ,T \ Z c / 

where we used 

k ( Cc - 1 r 1/(c " x) 

Thus, the asymptotic behavior of the generating function at T = T m and zero force is 

Z x-l/(c-l) 



Z{z lS )~K c . T (^-^-) 



(29) 



(30) 



(31) 



and we obtain the modified scaling of the canonical partition function right at the melting point temperature 

Z N (s = 1) ~ z- N N^I^K c . T /r{l/{c - 1)) . (32) 

We see that the loop statistics are crucial and enter via the loop exponent c, which gives rise to non-universal critical 
behavior. 

For finite force, s > 1, the branch point is first order and the scaling of the generating function at the critical point 
reads 



with 



s Lic_a(l/«) - u> c Lic-i(l/«) - 2** \ V2 



2w c Li c _i(l/s) 
leading to the canonical partition function 

Z N {s) ~ z^N-WlC c ,v/(sV^) , 



(33) 



(34) 



(35) 



with a scaling independent of the loop exponent c. Note that eq. (33) scales as (z c — z) ' in contrast to eq. (19), 
which leads to the different scaling of Zpf(s) in eq. (35) when compared to eq. (20). In the rest of this section we 
compare thermal and force induced phase transition and in particular determine the parameter range in which a phase 
transition is possible. 



3.1.1 Thermal phase transition 



First, we consider the thermal phase transition without external force, i. e. for s = 1 
reduces to the Riemann zeta function, Li c (l) = £ c . Since s = 1, we find that z^ < z- 



positive branch point. This is due to the fact that Z\,k(w, z\,) < 1 (see fig. |5J) 
monotonically increasing function of z, see eq. (10). 



In this case the polylogarithm 
p as long as k(w, z) has a real, 

v) 



z p k(w,z p ) — 1, and that zn(w,z) is a 



No thermal phase transition for c < 2 For c < 2 the function k(w, z) always features a branch point since h'(n)_—> oo 
for k — > 1/z. This ensures that for every w a Zh is found, where h(n, Z\,) is tangent to k — 1, see figs. [5^, and [5|d. As 
the branch point is always dominant, we find the universal scaling 

G/(k B T) = Annz b + 3/21niV (36) 

for all temperatures and no phase transition is possible. The RNA chain is always in the folded phase. 
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No thermal phase transition for c > c* ~ 2.479 For c > 2 the function Ji(k, z) and its derivative are finite for 
zk = 1. A sufficient condition for a branch point to exist is that the slope of h is greater than 1 for zk = 1, see filled 
circles in fig. [5}:, and hence 

h (k(w, z D ), z D ) — . > 1 . (37) 

P P 1 + VI + 4w( c y ' 

This can be achieved always for large enough w as long as the numerator is positive. On the other hand, for c > c* w 
2.479, where c* is the root of 

Cc--i-2C c * =0, (38) 



no branch point exists since the numerator in eq. (37 1 is negative. That means that for c > c* the pole z p (w,s) is 
always the dominant singularity of Z(z, s) and the molecule is always in the unfolded state. 



Thermal phase transition for 2 < c < c* at w — w c Only for 2 < c < c* a thermal phase transition is possible. 
For w > w c , see eq. (28), th e m olecule is in the folded phase governed by the branch point singularity Zb, which is 
determined by eqs. ( 12 ) and ( |16[ ). Decreasing w, i. e. increasing the temperature, causes the branch point and the pole 
to approach each other. At the critical point w c , eq. ( |28[ ), both singularities coincide and a phase transition occurs. 
For higher temperatures, w < w c , the RNA is unfolded and described by the pole z p , eq. (22). See fig. [6^i for an 
illustration. It will turn out that the temperature induced phase transition at zero force is very weak and that, in fact, 
the order of the phase transition is n, where n is the integer with (c — 2) _1 — 1 < n < (c — 2) . 



3.1.2 Force induced phase transition 



For the force induced phase transition the situation is slightly different as the position of the pole z p depends on the 
force, which enters via the weight of a non-nested backbone bond s, eq. (22). In contrast, the branch point Zb does 
not depend on s and hence it is constant, eqs. (12) and (16). Therefore, the branch point Zb and the critical point z c 
coincide and z\> can be determined exactly by the relation Zb — z c — z p (w c , s c ) = const. 



No force induced phase transition if w < w c (s — 1) or c > c* If the molecule is already in the unfolded phase, 
which can be due to high temperature, w < w c , or due to the non-existence of a branch point, c > c*, a force induced 
phase transition is not possible. In these cases the pole always dominates the system, regardless of the value of the 
applied force. 



Force induced phase transition if w > w c (s = 1) and c < c* A system below the melting temperature, w > 
w c (s — 1), is in the folded phase at zero force, s = 1. For small forces, i.e. s < s c , the system is described by the 
branch point singularity Zb, which is independent of s and hence does not depend on the force, eqs. (12 1 and (16) 
and fig. ^jp. However, as the pole z p (w, s) is a monotonically decreasing function of s, the branch point and the pole 
will eventually coincide at s = s c and a phase transition occurs. The critical force fugacity s c is defined as the root of 
cq. (28) for fixed w. For s > s c the pole is the dominant singularity and governs the system. Note that in contrast to 
the thermal phase transition, a force induced phase transition is possible even for c < 2. It will turn out that the force 
induced phase transition is second order in accordance with previous results [T71 [37] . 



3.2 Global phase diagrams 

Eq. ( [28] ) determines the phase diagram. In fig. [7^, we show the phase transition between the folded and unfolded states 
of a homopolymeric RNA in the w-s plane for a few different values of the loop exponent c that correspond to an ideal 
polymer, c = 3/2, and values between c = 2.1 and c = 2.3 as they are argued to be relevant for terminal and internal 
loops of varying topology including the effects of self-avoidance. Below the transition lines in fig. [7^,, the chain is in 
the unfolded (extended) state, above the line in the folded (compact) state. With growing loop exponent, the extent 



of the folded phase shrinks. In fact, w c (s,c) diverges for c — » c*, where c* ps 2.479, cf. eq. (38). Thus, for c > c* only 
the unfolded phase is present, as observed earlier in the restricted case of zero force, s = 1 13j70n the other hand, the 
critical line for c —¥ 2 and s = 1 goes down to zero, w c (s = 1) — > for c — > 2, which indicates that if no external force 
is applied to the molecule, there is only the folded phase for c < 2 and hence no thermal phase transition is possible. 
However, applying a sufficient force can drive the molecule into the unfolded state, as derived earlier. Concluding, only 
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Figure 7: (a) Phase diagram of homopolymeric RNA in the w-s plane for different values of the loop exponent 
c = 1.5, 2.1, 2.16, 2.3 featuring an unfolded phase (bottom right) and a folded phase (top left). For c = c* m 2.479, 
the phase boundary approaches s = 1 and the melting point w c (s = 1) diverges; therefore only the unfolded phase 
exists for c > c*. For c < 2, there is no melting transition at zero force as w c (s = 1) = 0. Thus if no force is applied, 
the system is always in the folded phase regardless of the temperature. The molecule can be denatured, though, by 
applying an external force even for c < 2, as can be seen from the phase boundary for c = 1.5. The filled circles 
denote the thermal denaturation transition point w c (s = 1) in the absence of an external force for c = 2.1, 2.16, 2.3. 
(b) Phase diagram in the w-c plane. With zero force, s — 1 (solid line), the weight w c drops to zero for c — > 2 and 
diverges as c — > c* ~ 2.479. For finite force, s > 1 (dashed line), a phase transition is possible, even for c < 2. (c) Phase 
diagram in the F-T plane for c = 2.3. Below the phase boundary the folded state is present, above the unfolded phase. 
Re-entrance at constant force is observed, as reported by Miiller |57| . 



for 2 < c < c* a thermal phase transition, denoted by the filled circles in fig. [7^,, is possible. A force induced phase 
transition is possible whenever c < c* and w > w c . For small force and c > 2, eq. ( |28[ ) can be expanded around s = 1 
and yields the universal asymptotic locus of the phase transition 

w c (s) ~ w c (s = 1) + (1 - s) c-2 r (2- c) 

(Cc-1 ~ KcY 

In fig. [7]d we show the critical line w c (s,c) for two different values of s as a function of the loop exponent c. In 
the absence of an external pulling force, i.e. for s — 1 (solid line), the transition line only occurs in the limited range 
2 < c < c* « 2.479. It is seen that for loop exponents around the relevant value of c w 2.1, the critical base pairing 
weight is quite small and of the order of w c ~ 0.1. A base pairing weight smaller than unity corresponds to a repulsive 
base pairing free energy that is unfavorable. This at first sight paradoxical result, which means that the folded phase 
forms even when the extensive part of the base pairing free energy is repulsive, reflects the fact that the folded state 
contains a lot of topological entropy because of the degeneracy of different secondary structures. The consequences 
for the theoretical description of systems, where single stranded nucleic acids occur, including DNA transcription, 
denaturation bubbles in dsDNA, untwisting of nucleic acids [H] . and translocation will be briefly discussed in 
section [4] 

The phase diagram, eq. | |28[ ), can also be displayed in the F-T plane by virtue of eqs. ([!]) and ([6| and is shown in 
fig. [7fc. Here, re-entrance at constant force becomes visible, in line with previous predictions by Miiller [37 . Expanding 



eq. "B| we obtain s(F) ~ 1 + (b ss F/(kBT)) 2 /6, for F —> 0. Eq. (39) yields the scaling of the critical force close to the 
melting temperature as 

F c ^{T m ~Tf'^ , (40) 

which depends on the loop exponent c and deviates from the predictions by Miiller }37] , who found a universal exponent 
1/2. 



3.3 Thermodynamic quantities and critical exponents 

We now consider the thermodynamic and critical behavior of various quantities. An arbitrary extensive quantity Y 
with the conjugate field / is obtained from the grand potential <!> = — keTlnZ via differentiation with respect to / 
and the chemical potential \x held constant 



(41) 
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Figure 8: Fraction of paired bases as a function of temperature for w — exp(e/ (keT)) and various c = 0.8, 1.5, 2.1, 2.3 
at zero force, s = 1. A phase transition is observed only for c = 2.3 for the range of positive values of e considered here 
(indicated by the filled circle), since for c < 2.195 the critical weight of a hydrogen bond is w c < 1, which can only be 
obtained for e < amounting to a repulsive interaction. For c < 2 no thermal phase transition can be observed at all. 
The inset shows the third derivative 6"' = d 3 8/dT 3 for c = 2.3, which reveals the phase transition. 



To evaluate the behavior of Y in the thermodynamic limit, N — > oo, one sets [i — > fi^, where ^ is defined as the 
chemical potential, at which N(fi) = —d<P/d/i diverges, i.e. N(fi) — > oo for \i — > /id- Another route to obtain Y is to 
conduct the calculation in the canonical ensemble, i. e. N — const, and to use the dominating singularity [16] . where 



Y = 



dG 
df 



k B TN 



(42) 



see eq. (15 1. In fact, for N —> oo eqs. (41) and (42) are equivalent and /id is associated with the dominating singularity 
of Z(z, s), namely Zd = exp(/^d/(kBT)), which will be shown now. The Gibbs free energy Q and the grand potential <P 
are related via a Legendre transform 



G(N) =$ + n(N)N 



Therefore, 



Y = k B TN 



91nzd 



df 



N 



d{g - g 

df 



= N df 



dfx 



Of! 
f df 



df 



(43) 



(44) 



where the two final expressions are evaluated at fi = /id- While performing derivatives of the dominant singularity 
and of the function k(w, z) is straightforward for z p and k p , see eq. (22), one has to employ implicit differentiation of 



cq. ( 16 ) to obtain the derivative of z^ and Kh, see supplementary material. For the latter case it turns out that eq. (41 1 



is more convenient to work with. 



3.3.1 Fraction of paired bases 

The fraction of paired bases is 

1 d\nZ N 
N (9 In w 

We obtain 

2Li c (z h K(w, z h )) 



Li c _i(z b K(w,z b )) 
in the folded phase (T < T c , F < F c ) and 
1 



1 



^1 + 4101^(1/3) 



(45) 



(46) 



(47) 
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Figure 9: Fraction of paired bases as a function of force for various w and c. The phase transition is visible as a 
kink in the curves and indicated by a filled circle, (a) c = 2.1 and varying w = 2, 4, 10. (b) w = 4 and varying 
c= 0.8, 1.5, 2.1, 2.3. 



in the unfolded phase (T > T c , F > F c ). In fig. [8] the temperature dependence and in fig. [9] the force dependence of 9 
is shown. The singularity at the critical point of the thermal phase transition for zero force, s = 1, is very weak and 
becomes visible in the n th derivative, with n being the integer with (c — 2) _1 — 1 < n < (c — 2) -1 , see supplementary 
material. The n th derivative exhibits a cusp, see inset of fig. [8j 

*^p. x \T-T m \** + const, (48) 

which is characterized by the critical exponent At = (c— 2) _1 — n for T < T m and At = 1 for T > T m , see supplementary 
material. The force induced phase transition is continuous, too, yet it exhibits a kink in 0(F) 

9{F) cx \F ~ F c \ x " + const , (49) 

which is characterized by the exponents At = for F < F c and At = 1 for F > F c . Therefore, the force induced 
phase transition is second order in accordance with previous results [171 157] . We note that for T — >• oo, i.e. w —> 1, a 
finite fraction of bases are still paired. The situation is different for the force induced phase transition where 9^0 
for F — > oo. 



3.3.2 Specific heat 

The specific heat is defined as 
■k B T d 2 T\nZ N 

dT^~ ■ (50) 



One observes that the specific heat in fig. 10 exhibits only a very weak dependence on the loop exponent, which 
stands in marked contrast to the findings for the short explicit sequence of tRNA-phe O [21] , where a pronounced 
dependence of the heat capacity on c is observed. The non-analyticity of 6, eq. (48), translates into a divergence of 
the n th derivative of the specific heat at the melting temperature 

d ^Pcx|T-T m |-^ + const, (51) 

with the critical exponent x = n ~~ (3 — c)/(c — 2) for T < T m and X = 1 for T > T m . This singularity is illustrated in 
the inset of fig. [10] for c = 2.3. 



3.3.3 Fraction of non-nested backbone bonds 

The fraction of non-nested backbone bonds is obtained by 
1 dhxZ N 

T - N dins (52j 
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Figure 10: Specific heat as a function of temperature for different loop exponents c = 0.8, 2.3. The non-critical behavior 
of the heat capacity curve depends on the loop exponent only marginally. However, the existence and position of the 
critical point (indicated by the filled circle) and the critical behavior depend on c. The inset depicts the third derivative 
of the specific heat, C" = d 3 C/dT 3 , for c = 2.3 revealing the phase transition. 




Figure 11: (a) Fraction of non-nested backbone bonds r as a function of temperature for c = 2.3. For c < 2.195 the 
critical weight of a hydrogen bond is w c < 1, which can only be obtained for e < 0, amounting to a repulsive interaction. 
Thus, for c < 2.195 or T < T m all segments are parts of loops or helices and hence r = 0. Filled circles indicate the 
melting temperature T m . (b) and (c) fraction of non-nested backbone bonds as a function of force for various w and 
c. Again, the phase transition is visible as a kink in the curves and is indicated by a filled circle, (b) c — 2.1 and 
varying w = 2, 4, 10, 50. (c) w = 50 and varying c = 0.8, 1.5, 2.1, 2.3. Filled circles indicate the position of the phase 
transition. 



and is 

r = (53) 
in the folded phase, as Zb does not depend on s, and reads 

r=l 2 W Li e .i( l/«) (M) 

1 + 4tuLi c (l/s) + V 1 + 4u;Li c (l/s) 

in the unfolded phase. For T — > oo the fraction of non-nested backbone bonds assumes a finite value smaller than one, 

which again indicates that the denatured phase in our model features pronounced base pairing. However, for large 



force F — > oo on invariably obtains r — > 1. As can be seen nicely in fig. 11 both t(T) and t(F) feature a kink at the 
critical point. 



3.3.4 Force extension curve 

The force extension curve is closely related to the fraction of non-nested backbone bonds r. The extension per monomer 
is given by 

k B TdlnZ N k B T dlnZ N d\ns turam \ K/ori \\ u navu \ r«.K\ 

x(F) = — -^r- = giQs -Qp- = b ss r (coth(/3F6 ss ) - l/((3Fb ss )) = b ss r £(/3F6 ss ) , (55) 
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Figure 12: Force extension curve as a function of force for various w and c. The phase transition is indicated by 
filled circles and occurs at zero extension and a finite threshold force needed to unfold the compact folded structure, 
(a) c = 2.1 and varying w = exp(e/k B T) = 2, 4, 10, 50. (b) w = 50 and varying c = 0.8, 1.5, 2.1, 2.3. Additionally the 
force extension curve of a freely jointed chain is plotted, which is the limiting form for w = 0. 



Since the Langevin function £ is a smooth function, the critical behavior of x(F) is governed by the behavior of 
the fraction of non- nested bonds r. As can be seen in fig. |12fc ,, the stretching behavior of the Langevin function is 
approached as the base pairing weight decreases, otherwise pronounced deviations are seen in the force-stretching 
curves. Also, a finite stretching force to unravel the folded state is needed. From fig. [12] it becomes obvious that the 
force induced phase transition is second order as the force extension curve exhibits a kink at the critical force denoted 
by filled circles. The force extension curves are in accordance with previous results [T71 137] . 



4 Implications for DNA melting 

How do the previous results impact on the theoretical description of the denaturation of double stranded nucleic acid 
systems, particularly DNA melting? When double stranded DNA approaches the denaturation transition, more and 
more inter-strand base pairs break up and loops proliferate. In the traditional theories based on the Poland-Scheraga 
model [3U S3] , the possibility of intra-strand base pairing was not considered. These models are thus accurate for 
duplexes formed between strands with sequences [AG]jv/2 and [TC]jv/2j where indeed base pairs (between A and T 
and between G and C) can only form between the two strands, not within one strand. For the case of duplexes formed 
by two strands with the sequence [AT]jy/2 or [GC]jv/2j both intra- and inter-strand base pairs can form and have 
identical statistical weights. In this case, our model predicts significant modifications for the duplex melting scenario. 

The above sequence examples are prototypes for two extreme cases of the general scenario characterized by statis- 
tical weights w and w for intra-strand and inter-strand pairing, respectively. Duplexes formed between [AG]tv/2 and 
[TC]jv/2 are characterized by w = 0, whereas duplexes formed between two [AT]jv/2 or [CG]jv/2 strands are charac- 
terized by w = w. Intermediate values of w and w can be achieved experimentally, for example, by [ATT]jy/3 and its 
complementary sequence [AAT]jv/3, in which case only 2/3 of the intra-strand base pairs can be of the Watson-Crick 
type and which would effectively lead to a lower weight of intra-strand base pairs, i. e. w < w. In naturally occurring 
DNA, a similar situation might be present above the glass transition where, for certain sequences, self-hybridization 
in a single strand is possible to some extent [33]. The weights of inter-strand and intra-strand base pairs can also be 
changed by applying an external force or torque on the duplex, for instance in the setup by Leger et al. |41j . We expect 
the weight w of inter-strand base pairs to decrease when the duplex is untwisted. This might lead to denatured regions 
in the duplex, where secondary structure can form if the sequence allows. We speculate that subsequent pulling first 
leads to the denaturation of the secondary structures, whose signature would be a threshold force around 1 pN-10 pN 



[T71 |2~4"] , see fig. 12 followed by the over-stretching transition of DNA [IB]. According to our previous arguments, a 
marked dependence on the sequence should be observed. 

Let us now review the classical Poland-Scheraga calculation [42] [43] and allow - in addition to the standard 
treatment - for intra-strand base pairing in denatured regions of the duplex, see fig. |13h ,. The grand canonical partition 
function of the two-state Poland-Scheraga model reads 

z = (i + z M ) f^(z B z M )«(i + i B ) - 1 = z b + ^m + 2z b z m (56) 



k=0 



1 - ZbZ, 



M 
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Figure 13: Illustration of a double stranded DNA molecule near the melting transition, (a) If intra-strand base pairing is 
possible, in inter-strand loops the vast majority of bases will form intra-strand secondary structure elements, (b) If the 
sequence does not favor intra-strand base pairs inter-strand loops without secondary structure will form, (c) Illustration 
of an asymmetric loop. 



where 



JV 



zw 



N=l 



1 



ZW 



(57) 



is the grand canonical partition function of a duplex in the bound state with w the weight of an inter-strand base 
pair and z now plays the role of the fugacity of a base pair. We distinguish between three different scenarios for a 
denatured or molten region inside the double strand, which is characterized by the partition function Zm, dep ending 
on the sequence and the intra-strand base pairing weight w: (i) for w > w c , with w c defined by eq. (28 1, intra- 



strand secondary structures in the folded (low temperature) phase are present, (ii) for w < w c intra-strand secondary 
structures in the unfolded (high temperature) phase form, and (iii) for w = large denaturation bubbles occur without 
intra-strand base pairing, corresponding to the original Poland-Scheraga model. The grand canonical partition for 
case (i) is characterized by the branch point singularity, eq. p0|, 



N=l 



(58) 



with c g = 3 and where the square is due to the fact, that on either strand secondary structures may form; TV- 
independent factors have been neglected as they do not affect the critical behavior. One realizes that the effective 
loop exponent c c s in this case is universal and independent of the configurational entropy of an inter-strand loop 
characterized by the exponent c. For case (ii), the secondary structures on each strand are in the unfolded phase 
and the partition functions are characterized by the pole, eq. ( p6| . As the number of non- nested segments, which is 
proportional to N, is non-zero, eq. (54), an inter-strand loop decorated with helices occurs, and is characterized by 



the loop exponent c. The respective grand canonical partition function reads 



7 n 



E 

N=l 



z N (z~ N ) 2 N-t = U £ (z/zl) 



(59) 



The third case constitutes the classical Poland-Scheraga model, where no base pairing is present and the loop exponent 
c describes the loop statistics. The partition function reads 



£m = E zNn ~" = u ^ 

N=l 



Combining eqs. ( 56|[60 ) yields the grand canonical partition function of a nucleic acid duplex 

4 _ wz + (1 + wz)Lic ot . f (z/z b ) 
1 - wz(l + Li aeff (z/z b )) 

with (i) c eff = 3, z b = z\, (ii) c e g 



(60) 



(61) 



c = 2.1, z h = Zp, or (iii) c e 



2.1, Zb = 1, respectively. 



The thermodynamics of nucleic acid duplexes is governed by the singularities of the partition function, eq. (61). 



The si ngu larities are readily recognized as the branch point £b of the polylogarithm |40j and the pole z p of the fraction 
in eq. (61 1, which is the root of the denominator 



1 = w)z p (l + Li£ cff . (zp/z b )) 



(62) 
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Figure 14: (a) The critical inter-strand binding energy In w c as a function of the inter-strand loop exponent c for different 
intra-strand base pairing weights w and different intra-strand loop exponents c. (i) for w > w c , with w c (c — 2.1) w 0.16, 
see eq. ( [28] ) , inside denatured regions of the duplex secondary structures that are folded occur on each strand, (ii) for 
w < w c secondary structures that are unfolded form on each strand, (iii) for w — the classical Poland-Scheraga result 
is obtained, where no intra-strand base pairs form, (b) Phase diagram of a nucleic acid duplex in the plane spanned 
by the intra-strand pairing weight w and the inter-strand pairing weight w for symmetric (solid line) and asymmetric 
(dashed line) inter-strand loops for c = c = 2.1. Left to these curves the duplex is in the molten (M) phase, to the 
right the duplex is in the bound (B) phase. The dotted line at w = w c (c = 2.1) ps 0.16, see eq. (28), depicts the phase 
boundary between folded (F) and unfolded (U) intra-strand secondary structures for c = 2.1, which separates cases (i) 
and (ii), respectively. 



At the critical point, the pole and the branch point coincide, Zb = z p , which yields 

*c = r-7T— 7 — r , (63) 



where (g eff = Lig off (l) is the Riemann zeta function. In fig. 14 1 the critical inter-strand base pairing weight w c as 
a function of the inter-strand loop exponent c is shown. The solid line depicts the phase diagram of the classical 
Poland-Scheraga model (iii) characterized by w = 0, where no secondary structures occur in denatured regions, cf. 
fig. |13b . The dashed line depicts the phase diagram for w = 0.1, where secondary structures occur that are in the 
unfolded state since w = 0.1 < w c for c = 2.1, (ii). One sees that the modifications to the w = limit are rather 
small. As in the case of the classical Poland-Scheraga model there is no phase transition for c < 1, a second order 
phase transition for 1 < c < 2, and a first order transition for c > 2 [3011331133]. The situation is different for case (i) 
with w — 1, where the secondary structures are in the folded and the inter-strand loop exponent is replaced by the 
universal value c e g = 3, which renders the denaturation transition first order and independent of c. In fig. |14b the 
phase boundary in the w-w plane is shown. The phase boundaries arising due to inter-strand base pairing (solid or 
dashed line) and intra-strand base pairing (dotted line) section the phase space into four quadrants, where the duplex 
is either in the bound (B) or molten (M) state and the intra-strand secondary structures are either folded (F, case (i)) 
or unfolded (U, case (ii)). Experimentally, a variation of temperature corresponds to a straight path through the origin 
of the phase diagram in fig. |14b , which, depending on the values of the inter- and intra-strand energies e and e, may 
cross a number of different phases. 

There are 2 N ways of constructing an asymmetric inter-strand loop |44j , where the number of bases in the lower 
and the upper part of a loop is not required to be identical, see fig. |13| : for an illustration. This additional factor 
reduces the inter-strand loop exponent c by 1. Therefore, the duplex denaturation transition is for case (i) right at 
the threshold between continuous and discontinuous transitions, because c e ff = 2, and for cases (h,ih) a continuous 
transition as c e g — 1.1. The consequences on the phase behavior are illustrated in fig. [Tip by broken lines. We add 
that the results for the competition between intra- and inter-strand base pairing are obtained using a factorization 
approximation for the two strands making up an inter-strand loop. In the appendix we show that this factorization is 
accurate in the thermodynamic limit of diverging loop size. 

As a main result, the effective inter-strand loop exponent is rcnormalized and takes on universal values for case (i), 
where intra-strand base pairing leads to secondary structures in the folded phase. The formation of intra-strand 
secondary structure influences the melting temperature and has implications for the determination base pairing free 
energy parameters [33J |3S] and other biotechnological applications where DNA melting and hybridization is involved. 
Thus, intra-strand interaction might be important to include in software packages predicting the stability of nucleic 
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acids based on a Poland-Scheraga scheme. Algorithms for cofolding of multiple nucleic acids already account for 

this gsgg. 
5 Conclusions 

The partition function of RNA secondary structures has been evaluated including arbitrary pairing topologies in the 
absence of pseudoknots, including the configurational entropy of loops in the form of the loop length dependent term 
gconf _ _ k B n nm - c . Exact expressions for the fraction of paired bases, the heat capacity, and the force extension 
curves are derived in the presence of an external pulling force. The observed thermal phase transition is very weak 
and of higher order, the force induced transition is found to be second order. The critical behavior and the critical 
exponents are found to depend on the loop exponent c. A temperature induced melting transition is only possible 
for 2 < c < c* « 2.479. Our theory has consequences on the denaturation of double stranded DNA molecules, in 
particular when intra-strand base pairs as well as inter-strand base pairs can form. In this case, the native double 
strand is in competition with intra-strand base pairing effecting the secondary structures discussed in this paper. 
Future directions will include loop exponents, that depend on the number of helices emerging from a given loop, 
treatment of pseudoknots, and cofolding nucleic acids to study the influence of intra-strand interactions during double 
strand denaturation. 

Financial support comes from the DFG via grant NE 810/7. T.R.E. acknowledges support from the Elitenetzwerk Bayern within 
the framework of Complnt. 

A Appendix 

The statistical weight of a TV base pair long molten region in the duplex with intra-strand interaction, as depicted in 
fig. [13k, is given by 



A f)Mf)M' 

Zn - 2^ S (M + M' + 4) £ ' (64) 

M,M'=0 v ' 

where c = 2.1 is the loop exponent describing inter-strand loops in DNA, M + M' + 4 counts the number of non-nested 
back-bones that contribute to the loop entropy, and Qj$ is given by eq. Q. The denominator in eq. (64 1 is due to the 



loop entropy and amounts to an effective interaction between the two strands as the expectation value of M depends 



on M 1 and vice versa. The asymptotic behavior of Z N NA can be estimated by establishing two inequalities. The first 



N N 



ZT A <* 4 E sM Qn E s Qn = s"Z N Z N , (65) 



M=0 M'=0 



where the scaling of Zn is given by eq. ( 20 ) or ( 26 1 depending on whether the secondary structures are in the folded 



or unfolded phase, respectively. The upper scalingboundary is obtained by factorizing eq. (64) and therefore removing 
the effective interaction, but retaining the loop entropy for each individual strand 

N „M+2AM N „M'+2f)M' 
7 DNA \ "* s VjV \ " s (f-p\ 

Z n > 2^ {M + 2) c (M' + 2f ■ {bb) 

M=0 y ' M'=0 y 1 



The scaling of eq. ( 66 1 follows from the dominant singularity analysis of the generating function 



00 N f\M i 



(z, s) = s 2 z 2 V V s M z N - -^—LUszniw, z)) - sz . (67) 

V ' > M + 2 c k(w.z) v v ;; y ' 

N=0M=0 y ' K ' ' 

Z^ DNA (z, s) features the same singularities as Z(z,s), see eq. namely the branch point of Zb of n(w,z), and the 
singularity z p given by the condition 1 = szk(w, z), see eq. (|22| 



By virtue of the inequalities ( 65 ) and ( 66 ) we conclude that 
Z° NA cx z h N N- £ ^ (68) 
with c e ff = 3, £b — z 2 for case (i) and c c g — c — 2.1, z\, = zz for case (ii) for symmetric molten loops. 
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A Derivatives of k, 

In the main text we derived the three constitutive equations determining the thermodynamic behavior of the system 
w 

k(w,z) — l-\ -~Li c (zn(w,z)) (Sla) 

k{w, z) 

k(w, z h (w)) 2 = wLi c _!(z b (w)K(w, z h (w))) - wU c (z h (w)n(w, z h (w))) (Sib) 
sz p (w, s)k(w, z p (w, s)) = 1 . (Sic) 

The function n(w,z) is the smallest positive root of eq. (Sla). Derivatives of k are obtained by implicit differen- 
tiation. To obtain the derivative of k with respect to an arbitrary quantity one differentiates both sides of eq. (Sla) 
with respect to this quantity and solves for the desired derivative. Explicitly, 

9k = wkU c _ 1 (zk) 

dz z(k 2 + wLi c (zn) — w~L\ c -i{zn)) 

dn = kU c (zk) ^ 

dw k 2 + wLi c (z/i) — wL1 c _i(zk) 

dn kw lnzLi c _i(z«;) + lnwLi c (z«;) 

df ~ f~ ' k 2 + wLi c (zn) - wU^^zk.) ' ^ ' 
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B Expansion of the branch point for T < T m 



To obtain the critical behavior for T < T m at zero force, s = 1, we perform an asymptotic expansion of eqs. (Sla) 
and (Sib) around the critical point, where the branch point and the pole coincide. Thus, at the critical point all 
eqs. (SI) have to hold and we obtain the critical values exactly as 

«c = ^(l + V / l + 4<c) , ^c = 2(l + v / l + 4< c ) _1 , and w c = ^^j 2 ■ (S5) 

As an ansatz for Zb(T) and Kb(T) we use a power series in d = w/w c — l, where w — exp[e/(kBT)] is the Boltzmann 
weight of a base pair. To simplify notation, we define a = (c — 2) _1 and write 

Zb/zc ~ 1 + aid + a 2 d 2 + . . . + d a (a a + a a+ id + a a+2 d 2 + ...)+ a 2a -id 2a ~ 1 + . . . (S6a) 
Kb/Kc ~ 1 + bid + b 2 d 2 + ... + d a (b a + b a+1 d + b a+2 d 2 + ...)+ b 2a ^d 2a - x + ... (S6b) 

Plugging the ansatz (S6) into eqs. (Sla) and (Sib) we can solve order by order. To do so, the series representation of 
the poly logarithm Li v (x) around x = 1 is used [1] 

U v {x) ~ Cu - C-i(l - x) + i(C„- 2 - (v-i)(x - l) 2 + ... (1 - x) v ~ l (r(l - „) + 1(1 - u)r{l -v){l-x) + ..y (S7) 

We obtain the coefficients for Zb 

ai = --^- (S8a) 

Cc-l 



a2 = 7 |^(2C c -i-Cc) (S8b) 



«3 = - S- (5C C 2 - 1 - 6C C - 1 Cc + 2C 2 ) (S8c) 

a 4 - i^(14C c 3 -i - 28C 2 _!Cc + 20C c -iC 2 - 5C C 3 ) (S8d) 

Sc-l 

a 5 = - (42C c 4 _i - 120C 3 _iCc + lSSC^rC 2 - 70C c -iC c 3 + 14C C 4 ) (S8e) 

a Q = (S8f) 

■^,=- r(i - c ' +r( '- !) r- g -- 3C -:?' +2g r' <ssg) 



Cc-l V ^(2 - c)Cc-i 

(r(l - c) + r(2 - c)) (C e 2 _x + 4(c - 2)C c -iCc + (5 - 3c)C 2 ) / C 2 -i-3Cc-iCc + 2C 2 i 

aa+2 " ^2)^ I ^-cKc-x ' (S8h) 



a 2a -i = 



(S8i) 
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We obtain the coefficients for «b 

h = ^- (S9a) 

Cc-1 

62 = -7^-(Cc-i-Cc) (S9b) 
h = 2-$- (Cc-i -Ccf (S9c) 

<>C-1 

fo 4 = -5^- (Cc-i -Cc) 3 (S9d) 

Sc-l 



'a+l 



c 5 

Y 9 



^^(Cc-i-Cc) 4 (S9e) 



r(i - c) + r(2 - c) / c c 2 -i - 3C c -iCc + 2C C 2 ^ a+] 



Cc-i V ^(2 - c)C 



c-1 



, ( Ctl - 3Cc-lCc + 2C C 2 \ " Cc 2 -1 - (C - 2)Cc-lCc - Cc 2 /oq \ 

H ^-cKc-! J (S9g) 

Cc-i-Cc / C c 2 -i-3Cc-iCc + 2C 2 \" 

"+ 2 2(c - 2)2 . T(2 - C )C c 4 _i I ^(2 - c)Cc-i y 

x ^2(c - 2)r(l - c)(Cc_i - 2Cc)(C 2 -i + 2(c - 2)C c -iCc + (5 - 3c)C c 2 ) 

+ T(2 - c)((c - 3)C c 3 _x + (c - 3)(4c - 7)C c 2 _iCc - (3c - 5)(4c - 9)Cc-iC c 2 + (3c - 5)(4c - 7)C C 3 )) 

(S9h) 

Cc-2 - 3Cc-i + 2Cc / C 2 _i - 3Cc-iCc + 2C 2 



J2a-1 — 



(c-2)r(2-c) V r(2-c)Cc-i 



(S9i) 



Remarkably, the expansion of the product z^k^ yields 

z b Kb ~ 1 + (o Q + b a )d a + (a a+1 + b x a a + b a+1 + ai b a )d a+1 + 0(d a+2 ), (S10) 

meaning that all integer powers d n , for n < a, vanish, n is the integer with (c — 2) _1 — 1 < n < (c — 2) _1 . This fact 
renders the transition of high order when c — > 2 and thus a^oo. For the fraction of bound bases this can be seen 
directly by expanding the low temperature expression 

g= U c (z hKh ) 

Ll c _l(z b Kb) 

using eqs. (S7) and (S10). 
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